Croatian National Corpus
The Croatian National Corpus (Croatian: ''Hrvatski nacionalni korpus'', ''HNK'') is the largest and most important corpus of the Croatian language. Its compilation started in 1998 at the Institute of Linguistics of the Faculty of Humanities and Social Sciences, University of Zagreb, following the ideas of Marko Tadić. The theoretical foundations for a general-purpose, representative, multi-million corpus of the Croatian language, and statements of the need for one, had started to appear even earlier (Tadić 1990, 1996, 1998). The Croatian National Corpus is compiled from selected texts written in Croatian covering all fields, topics, genres and styles: from literary and scientific texts to textbooks, newspapers, user groups and chat rooms.
The initial composition was divided into two constituents:
# ''30-million corpus of contemporary Croatian language'' (30m), which included samples of texts from 1990 onwards. The criteria for including text samples were that they be written by native speakers and cover different fields, genres and topics. Translated texts and poetry were excluded.
# ''Croatian Electronic Text Archive'' (HETA), which included complete texts, particularly serial publications (volumes, series, editions etc.) that would have unbalanced the 30m had they been inserted there.
Since 2004, with the adoption of the concept of the 3rd generation corpus, the two-constituent structure has been abandoned in favor of several subcorpora and a larger size. Since 2005 the HNK has comprised 105 million tokens and is composed of a number of different subcorpora, which can be searched individually or all together as a whole corpus. In 2004 the HNK also migrated to a new server platform, namely the Manatee/Bonito server-client architecture. Searching the HNK (today still with free test access) requires Bonito, a free client program produced at the Natural Language Processing Laboratory of the Faculty of Informatics, Masaryk University in Brno, Czech Republic. Its interface supports complex and elaborate queries over the corpus, different types of statistical results, total or partial word lists according to different query criteria (with their frequencies), frequency distribution of types, automatic collocation detection etc.
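To make two of the listed features concrete, the following is a minimal Python sketch of a word list with frequencies and of adjacent-pair collocation detection. It is not Bonito's or Manatee's implementation: the function names, the toy sample text and the choice of pointwise mutual information (PMI) as the association measure are all assumptions made for illustration, and the source does not say which statistics Bonito actually uses.

```python
import math
from collections import Counter

# Toy sample text (hypothetical, not taken from the HNK).
SAMPLE = (
    "hrvatski nacionalni korpus je korpus hrvatskog jezika "
    "hrvatski nacionalni korpus je velik"
)


def tokenize(text):
    """Naive whitespace tokenizer; a real corpus uses proper tokenization."""
    return text.lower().split()


def frequency_list(tokens, min_count=1):
    """Return (word, frequency) pairs, most frequent first."""
    counts = Counter(tokens)
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


def collocations(tokens, min_count=2):
    """Rank adjacent word pairs by PMI (an assumed measure, see lead-in).

    PMI = log2( p(w1, w2) / (p(w1) * p(w2)) ), with probabilities
    estimated from raw unigram and bigram counts.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scored = []
    for (w1, w2), c in bigrams.items():
        if c < min_count:
            continue  # skip rare pairs, whose PMI is unreliable
        p_pair = c / (n - 1)
        p_w1 = unigrams[w1] / n
        p_w2 = unigrams[w2] / n
        scored.append(((w1, w2), c, math.log2(p_pair / (p_w1 * p_w2))))
    # Highest PMI first; ties broken by higher pair frequency.
    scored.sort(key=lambda item: (-item[2], -item[1]))
    return scored


if __name__ == "__main__":
    tokens = tokenize(SAMPLE)
    for word, freq in frequency_list(tokens, min_count=2):
        print(word, freq)
    for pair, freq, pmi in collocations(tokens):
        print(pair, freq, round(pmi, 2))
```

The `min_count` threshold reflects a common design choice: PMI tends to overrate pairs that occur only once, so rare pairs are filtered out before ranking.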
Excerpt source: Wikipedia, the free encyclopedia